Universal SIMD-Mathlibrary

Author

  • Helmut Dersch
Abstract

Standard functions for single-precision floating-point vector datatypes are provided for the SIMD platforms x86 (SSE2), PowerPC, and Cell. In most cases, speed and/or accuracy compare favourably with existing SIMD libraries (MacOS Accelerate Framework, Cell SDK). Most of the algorithms are based on those of the Cephes library, while the implementation is branch-free and parallelized to minimize pipeline stalls. The Universal SIMD Mathlibrary (usm) provides the functions sin, cos, tan, asin, acos, atan, atan2, sqrt, exp, log, pow, abs, ceil, floor, ldexp, and frexp. It is licensed under the GPL3.
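To illustrate what "branch-free" can mean for vector code, the following is a minimal sketch in portable C. It is not taken from usm: the `vec4f` type and all function names are hypothetical, and a plain four-element struct stands in for a hardware vector register. The idea shown is the common mask-and-select pattern, where a comparison produces an all-ones or all-zeros bit mask per lane and the result is assembled with bitwise operations instead of `if`/`else`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-lane vector; stands in for a hardware SIMD register. */
typedef struct { float lane[4]; } vec4f;

/* Reinterpret the bits of a float as a 32-bit integer (no aliasing issues). */
static uint32_t bits_of(float x) {
    uint32_t u;
    assert(sizeof u == sizeof x);  /* the sketch assumes 32-bit floats */
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Reinterpret a 32-bit integer as a float. */
static float float_of(uint32_t u) {
    float x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/* abs: clear the sign bit in every lane; no comparison is involved. */
static vec4f vec4f_abs(vec4f v) {
    vec4f r;
    for (int i = 0; i < 4; ++i)
        r.lane[i] = float_of(bits_of(v.lane[i]) & 0x7FFFFFFFu);
    return r;
}

/* Mask for one lane: all ones if a < b, all zeros otherwise. */
static uint32_t lt_mask(float a, float b) {
    return (uint32_t)0 - (uint32_t)(a < b);
}

/* select: bits of a where the mask is set, bits of b elsewhere. */
static float select_f(uint32_t mask, float a, float b) {
    return float_of((bits_of(a) & mask) | (bits_of(b) & ~mask));
}

/* Replace every lane below lo by lo, using mask and select only. */
static vec4f vec4f_clamp_low(vec4f v, float lo) {
    vec4f r;
    for (int i = 0; i < 4; ++i) {
        uint32_t m = lt_mask(v.lane[i], lo);
        r.lane[i] = select_f(m, lo, v.lane[i]);
    }
    return r;
}
```

The loops over the four lanes exist only because this sketch is scalar; on a SIMD unit each loop body would correspond to a single vector instruction applied to all lanes at once. Whether a compiler emits truly branch-free machine code for the scalar version depends on the compiler and target, so this should be read as an illustration of the pattern rather than a performance claim.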

Similar resources

UMAC: Fast and Secure Message Authentication

We describe a message authentication algorithm, UMAC, which can authenticate messages (in software, on contemporary machines) roughly an order of magnitude faster than current practice (e.g., HMAC-SHA1), and about twice as fast as times previously reported for the universal hash-function family MMH. To achieve such speeds, UMAC uses a new universal hash-function family, NH, and a design which a...

Regular and almost universal hashing: an efficient implementation

Random hashing can provide guarantees regarding the performance of data structures such as hash tables, even in an adversarial setting. Many existing families of hash functions are universal: given two data objects, the probability that they have the same hash value is low given that we pick hash functions at random. However, universality fails to ensure that all hash functions are well behaved...

Modeling Universal Instruction Selection

Instruction selection implements a program under compilation by selecting processor instructions and has tremendous impact on the performance of the code generated by a compiler. This paper introduces a graph-based universal representation that unifies data and control flow for both programs and processor instructions. The representation is the essential prerequisite for a constraint model for ...

A Programmable, Scalable-Throughput Interleaver

The interleaver stages of digital communication standards show a surprisingly large variation in throughput, state sizes, and permutation functions. Furthermore, data rates for 4G standards such as LTE-Advanced will exceed typical baseband clock frequencies of handheld devices. Multistream operation for Software Defined Radio and iterative decoding algorithms will call for ever higher interleav...

Concurrent Processing Memory

A theoretical memory with limited processing power and internal connectivity at each element is proposed. This memory carries out parallel processing within itself to solve generic array problems. The applicability of this in-memory finest-grain massive SIMD approach is studied in some detail. For an array of N items, it reduces the total instruction cycle count of universal operations such as...

Publication date: 2008